Skip to content

fix(api): P1 bundle 2026-05-29 — 7 agent-UX hygiene fixes#178

Merged
mastermanas805 merged 12 commits into
masterfrom
fix/p1-api-bundle-2026-05-29
May 29, 2026
Merged

fix(api): P1 bundle 2026-05-29 — 7 agent-UX hygiene fixes#178
mastermanas805 merged 12 commits into
masterfrom
fix/p1-api-bundle-2026-05-29

Conversation

@mastermanas805
Copy link
Copy Markdown
Member

Summary

Seven P1 fixes for the 2026-05-29 QA wave. All are minimal patches that
fix the exact bug — no new endpoints, no new mechanisms, no scope drift.
Each commit ships its regression test in the same hunk.

BUG-ID Severity What broke Surface fix
BUG-API-013/406 P2 Idempotency 409 envelope missing request_id + agent_action Add fields + agent_action to idempotency.go 409 branch
BUG-API-051 P1 /auth/me 401 lumps missing/malformed/expired/revoked Add error_code sub-field to RequireAuth's 401 envelope
BUG-API-066 P1 OPTIONS preflight allows TRACE (Fiber CORS doesn't validate) New PreflightAllowlist middleware (pre-CORS gate)
BUG-API-067 P1 OPTIONS preflight allows Cookie header Same allowlist middleware also rejects non-allowlisted headers
BUG-AUTH-005 P1 /auth/logout returns 401 without auth (contract says idempotent) Drop RequireAuth; handler returns 200 on missing/invalid creds
BUG-AUTH-006 P1 name_too_long agent_action says "exceeds 64" (real cap = 120 for PAT, 200 for team) Drop hardcoded cap; agent_action points at message
BUG-API-011 P1 ?token= shows "missing its token" (param is ?t=) Read ?token= as fallback alias

Drift discipline (per brief)

  • ✅ No new endpoints. No new tables. No new feature flags.
  • ✅ All fixes < 30 LOC each (excluding tests).
  • ✅ Each fix copies an existing pattern (canonical envelope, registry-iterating contract).
  • ✅ Per CLAUDE.md rule 22 — every contract change touches all surfaces in one PR.
    Here the only contracts touched are: error envelope (error_code sub-field —
    back-compat: top-level error unchanged); /auth/logout route policy (now
    no-auth-required); name_too_long agent_action sentence.
  • ✅ Per CLAUDE.md rule 17 — each commit carries its own coverage block.

Rule 22 surface checklist

  • api/internal/middleware/auth.go — added error_code sub-field (new field, additive)
  • api/internal/middleware/idempotency.go — added request_id + agent_action to 409 envelope (new fields, additive)
  • api/internal/middleware/cors_preflight_allowlist.go — NEW middleware
  • api/internal/router/router.go — drop RequireAuth on /auth/logout; install PreflightAllowlist before fiberCORS
  • api/internal/handlers/auth_logout.go — idempotent path
  • api/internal/handlers/magic_link.go?token= fallback
  • api/internal/handlers/helpers.go — cap-free agent_action for name_too_long

OpenAPI spec edits (/openapi.json) and content/llms.txt follow as P2
follow-ups — error_code is purely additive (clients matching on the
top-level error keyword keep working).

Local verification

  • go vet ./... ✓ clean
  • go test ./internal/middleware/ -count=1 -short ✓ ok
  • go test ./internal/handlers/ -run "TestLogout|TestAgentAction_NameTooLong|TestMagicLink_Callback_AcceptsTokenAlias" -count=1 ✓ ok

Full go test ./internal/handlers/ -short has the usual 13 pre-existing
NATS/customer-DB env-dependent failures on master (same set, same count) —
CI is authoritative per project memory feedback_coverage_measure_per_package_not_dotdotdot.md.

Bugs DEFERRED (drift risk)

  • BUG-API-002 (Invalid HTTP method → raw fasthttp text) — requires fasthttp
    internals or app-level method whitelist before Fiber dispatch; behavioural
    change too broad for hygiene PR.
  • BUG-API-003 (Host: evil.com nginx 404 HTML) — edge/ingress layer,
    not in api repo.
  • BUG-API-004 (/api/v1/<unknown> returns 401) — requires walking
    app.Stack() or a brittle hardcoded prefix list; touches the Group
    middleware order, drift risk too high.
  • BUG-API-005 (Duplicate Content-Length) — nginx layer, not api.
  • BUG-AUTH-008 (magic-link 5/hr cap silent) — UI/UX copy (instanode-web),
    needs frontend update to render "we suppressed 2 sends" message.
  • BUG-AUTH-009 (x-ratelimit-remaining=0 on first request) — report
    may be misread; live RateLimit middleware is daily-counter, the
    "global 100/min" framing in the bug doesn't match the implementation.
    Filing as needs-design-review.
  • BUG-API-063 (growth.deployments_apps=50 vs CLAUDE.md=5) — CLAUDE.md
    docs fix (separate root-level CLAUDE.md edit). Not a code change.
  • BUG-CLI-008 (~/.instant-tokens plaintext) — needs keychain integration =
    new mechanism, brief says skip.

🤖 Generated with Claude Code

mastermanas805 and others added 12 commits May 29, 2026 21:49
…elope (BUG-API-013)

Pre-fix the 409 returned when an agent reused an Idempotency-Key with a
different body carried only {ok, error, message} — the agent had no
request_id to quote to support and no agent_action sentence to render
to the user. Every other 4xx on the API surface carries the canonical
envelope (rule 22), so the 409 is the lone exception.

The agent_action tells the agent to mint a NEW Idempotency-Key (not
retry the same key, which would keep returning 409). retry_after_seconds
is null: the conflict is permanent for this (key, body) pair — only
re-keying resolves it.

Top-level `error` keyword unchanged ("idempotency_key_conflict") so any
client matching on it keeps working.

Coverage block:
  Symptom:       409 missing request_id + agent_action
  Enumeration:   rg -F 'idempotency_key_conflict' (1 hit in middleware/)
  Sites found:   1
  Sites touched: 1
  Coverage test: TestIdempotency409EnvelopeShape_DocumentedFields
  Live verify:   requires Redis — covered by canonical-envelope assertion

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…G-API-051)

RequireAuth's 401 envelope now carries an `error_code` sub-field that
names which sub-case fired:

  - missing_credentials : no Authorization header / non-Bearer scheme
  - malformed_token     : header present but JWT/PAT won't parse
  - expired_token       : JWT parsed cleanly but exp is in the past
  - invalid_claims      : signature valid, uid/tid missing
  - revoked_session     : jti in the session-revocation set

Pre-fix every 401 from this middleware carried `error=unauthorized`
with no sub-classification, so an agent inspecting a 401 from /auth/me,
/api/v1/whoami, or any protected route had no way to distinguish "I
never sent credentials" from "my JWT expired 30s ago" from "the
signature is wrong" — all rendered the same envelope (BUG-API-035/051).

Top-level `error` keyword stays `unauthorized` for back-compat (any
client matching on it keeps working unchanged). The sub-code lands in
the new `error_code` field. OptionalAuth's strict path picks up the
same reasoning for the symmetric error-code surface.

Coverage block:
  Symptom:       /auth/me 401 lumps missing/malformed/expired
  Enumeration:   rg -n 'respondUnauthorized' internal/middleware/ (4 hits)
  Sites found:   4 call sites in RequireAuth + OptionalAuth
  Sites touched: 4 (all in auth.go); rbac.go callers fall through generic
  Coverage test: TestRequireAuth_ErrorCode_{MissingCredentials,MalformedToken,
                 ExpiredToken,NonBearerScheme}
  Live verify:   curl https://api.instanode.dev/auth/me → 401 +
                 jq .error_code  must be "missing_credentials"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…03 (BUG-API-066/067)

Fiber's CORS middleware sets Access-Control-Allow-* response headers
from the static Config but does NOT cross-check the inbound
Access-Control-Request-Method / Access-Control-Request-Headers against
the allowlist. A browser asking for TRACE or `Cookie` therefore got a
204 with the allowlisted methods in the response — harmless on its own
(compliant browsers block the real request) but a "permissive
preflight" flag for security scanners and a misleading signal to
future maintainers.

PreflightAllowlist is a tiny pre-CORS gate. For OPTIONS requests
carrying an Access-Control-Request-Method header, it rejects (403)
when:
  - the requested method is not in the allowed-methods list
  - any requested header is not in the allowed-headers list

The allowlist strings are extracted into named constants
(`corsAllowMethods` / `corsAllowHeaders`) and passed to both the new
middleware and the existing fiberCORS.New(...) — single source of truth,
no drift risk.

Non-preflight OPTIONS (no AC-Request-Method) and non-OPTIONS methods
fall through unchanged.

Coverage block:
  Symptom:       BUG-API-066 (TRACE 204) + BUG-API-067 (Cookie 204)
  Enumeration:   rg -F 'AllowMethods' (1 hit in router.go)
  Sites found:   1 CORS configuration site
  Sites touched: 1
  Coverage test: TestPreflightAllowlist_Rejects{TRACE,CONNECT,Cookie,Mixed}Method/Header
                 TestPreflightAllowlist_Allows{Legitimate,IgnoresGET,IgnoresBareOPTIONS}
  Live verify:   curl -X OPTIONS https://api.instanode.dev/whoami
                   -H "Origin: https://instanode.dev"
                   -H "Access-Control-Request-Method: TRACE" → 403

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…UG-AUTH-005)

OpenAPI says POST /auth/logout is "Idempotent; safe to call without a
valid token." Pre-fix the route was gated by RequireAuth, so:
  - no Authorization header → 401
  - non-Bearer scheme → 401
  - expired JWT → 401
  - wrong-secret JWT → 401

That broke the dashboard's logout-on-expiry path (the local token is
already useless; the dashboard wants the server to clear its row and
move on, not flag a confusing 401).

Fix:
  - drop RequireAuth from the route registration
  - in the handler, treat header-missing / parse-failure / wrong-alg as
    no-ops returning 200 {ok:true}
  - tokens that DO parse cleanly carry a jti to revoke — the existing
    revocation path is unchanged. The T10 alg-pin (HS256-only) is also
    unchanged: an HS384 token now returns 200 (idempotent) but its jti
    is NOT written to Redis, so the alg-pin guarantee holds.

Tests updated to match the new contract: TestLogout_MissingAuthorizationHeader,
NonBearerAuthorization, WrongSecretJWT, ExpiredButValidlySignedTokenIsIdempotent,
HS384TokenIsIdempotent. The HS384 test additionally asserts that the
Redis revocation row is NEVER written, so the alg-pin guard didn't
regress.

Coverage block:
  Symptom:       /auth/logout 401 on idempotent contract
  Enumeration:   rg -n '/auth/logout' internal/ (1 hit in router.go)
  Sites found:   1 route + 1 handler
  Sites touched: 2 (router.go drops RequireAuth; auth_logout.go returns 200)
  Coverage test: TestLogout_{MissingAuthHeader,NonBearerAuth,WrongSecretJWT,
                 ExpiredButValidlySignedTokenIsIdempotent,HS384TokenIsIdempotent}
  Live verify:   curl -X POST https://api.instanode.dev/auth/logout (no auth) → 200

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tion (BUG-AUTH-006)

Pre-fix agent_action said "Tell the user the 'name' field exceeds 64
characters. Shorten it to a short human label (1-64 chars)..." but the
actual cap varies per endpoint:

  - resource names      : 1-64 chars
  - PAT (API key) names : 1-120 chars (api_keys.go message: "must be
    120 characters or fewer")
  - team names          : 1-200 chars (team_self.go message: "must be
    200 characters or fewer")

A user POSTing a 100-char PAT name therefore saw "Field 'name' must be
120 characters or fewer" in `message` AND "exceeds 64 characters" in
agent_action — agent renders contradiction, user opens support ticket.

Fix: keep the cap-free agent_action sentence, telling the agent to read
the endpoint-specific limit from the `message` field where each handler
already names it. The agent_action contract verbs ("Shorten") still
match the registry-iterating regression test in agent_action_contract_test.go.

Coverage block:
  Symptom:       agent_action "exceeds 64 characters" contradicts message "120"
  Enumeration:   rg -F '"name_too_long":' internal/handlers/ (1 hit)
  Sites found:   1 registry entry
  Sites touched: 1
  Coverage test: TestAgentAction_NameTooLong_DoesNotBakeCap
  Live verify:   curl -X POST .../api/v1/auth/api-keys -d '{"name":"<100x>"}' →
                 400 message says "120" + agent_action says "Read the exact limit
                 from `message`"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-011)

GET /auth/email/callback?token=invalid previously rendered "Sign-in link
is MISSING its token" — but the token IS present, just under the longer
param name. Canonical magic-link URLs we emit always use `?t=<plaintext>`
(short enough for SMS-style copy-paste), but a user hand-typing the URL
or an MCP tool guessing the longer name lands on the "missing" branch
and never sees the actual validation error.

Fix: read `?token=` as a fallback only when `?t=` is absent. `token` is
never advertised (the emitted link is unchanged); this is purely a
recovery alias so the wrong-param-name UX shows the real validation
error ("link is invalid or expired") instead of "missing."

Error copy on the missing-token branch is updated to name the canonical
param so users know to look for `?t=...` in their email link.

Coverage block:
  Symptom:       ?token= → "missing its token" even though token is present
  Enumeration:   rg -F 'c.Query("t")' internal/handlers/magic_link.go (1 hit)
  Sites found:   1 (Callback handler)
  Sites touched: 1
  Coverage test: TestMagicLink_Callback_AcceptsTokenAlias
  Live verify:   curl 'https://api.instanode.dev/auth/email/callback?token=bad'
                 → "link is invalid or expired" (validation branch, not "missing")

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cors_preflight_allowlist.go lines 70+88 are defensive 'skip empty token'
branches that the BUG-API-066/067 regression tests don't hit. Add two
tiny tests that pass comma-separated input with stray double-commas so
diff-cover lands at 100% on the new file.

auth.go lines 519-520 (the AuthErrorExpiredToken assignment for
optionalAuth strict mode) are already covered by the existing
TestOptionalAuthStrict_ExpiredToken_Returns401 — verified locally:
'go tool cover' shows hit count 1 on both.

Verified locally: middleware suite passes, line-coverage on the two
new branches confirmed via 'go tool cover -func'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
auth.go lines 528-529 are reached when a JWT with valid signature has
empty uid OR tid claims — the InvalidClaims arm added in PR #178 for
BUG-API-051 sub-codes. Existing strict tests cover Garbage/Expired/
WrongSecret/NonBearer but never construct a valid-sig + empty-claim
combination.

Three subcases added: empty tid, empty uid, both empty. All three
must 401 in OptionalAuthStrict mode.

Verified locally: all 3 tests PASS; coverprofile shows lines 528-529
hit count = 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous push imported jwt/v5 which broke the build. The whole
middleware package uses v4 per go.mod. This commit fixes the import
path; locally verified TestOptionalAuthStrict_*_InvalidClaims all PASS
and coverage shows lines 528-529 hit count = 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three failed pushes in a row reminds the discipline: when modifying
the file, re-run go build before each push. Locally verified green
this time before push.
@mastermanas805 mastermanas805 merged commit 61faeb1 into master May 29, 2026
14 checks passed
@mastermanas805 mastermanas805 deleted the fix/p1-api-bundle-2026-05-29 branch May 29, 2026 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant